Abstract 

Correlation and multiple regression analyses revealed the extent to which Education Week's 
Quality Counts 2002 report card grades were intercorrelated and predicted by demographic 
factors among states. The researchers used four grades from Quality Counts 2002 and five 
demographic factors. The researchers found few intercorrelations among Quality Counts 2002 
grades, which suggested that the report cards are comprehensive. The multiple regression 
equations showed that demographic factors accounted for between 4 and 66 percent of the variance 
in Quality Counts 2002 grades. This suggested that the report cards are moderately functional, 
i.e., they have potential to guide policy development. The researchers conclude that the Quality 
Counts report cards are useful for provoking discussion but not for resolving educational policy 
disputes. 

The Quality Counts 2002 State Report Cards: Correlations with Measures of State Education Environments 

Accountability has long been an issue in K-12 education in the United States. Arguments 
on the pro side range from a Commissioner of Education’s concern in the early 1960s that 
“American education had not yet faced up to the question of how to determine the quality of 
academic performance in the schools” and thus lacked the “support to shore up educational 
weakness” (Vinovskis, 1998, p. 5) to President George W. Bush’s characterization of strong 
accountability measures as the best alternative to “the soft bigotry of low expectations” (White 
House, 2001). On the other side, opposition has ranged from the 1970 argument against the 
National Assessment of Educational Progress (NAEP) that evaluation is far more useful in local 
systems than on a national level (Vinovskis, 1998, p. 8) to McNeil’s (2000) claims of the harm 
done by Texas’s accountability system. 

One form of accountability system is the state report card. In defense of organizational 
report cards, Gormley and Weimer (1999, p. 21) argued that they can reduce “information 
asymmetry” (where service providers understand what they are doing better than the client) and 
thus enable both clients and public managers to pressure providers for better services. In fact, 
the Arizona Education Association (2002, p. 13) credited Arizona’s ranking of 50th on 
Education Week's Quality Counts report card for helping “build the momentum” for a sales tax 
increase to improve education there. But McDonnell (1994) explained how thoroughly 
providers may disagree with public managers over the meaning of any given information. She 
also described the need policy makers feel to move “quickly while the policy window is open” 
(p. 401) and to base decisions on quality-of-education indicators despite their technical 
limitations. 

On the negative side, detractors have argued that quality-of-education indicators are at 
worst harmful, and at best necessary but insufficient sources of guidance. Those who claim they 
are harmful cite practitioners’ narrow focus on indicators, the fact that assessment systems 
cannot validly both enlighten practitioners and sanction them (McDonnell, 1994; McNeil, 2000), 
and the indicators’ unintended negative consequences (Haney, cited in Scheurich, Skrla, & Johnson, 
2000). Those who argue that quality-of-education indicators are necessary but insufficient 
guides claim that without attention to the process by which indicators and policy are translated 
into classroom practice (Knapp, 1997), to capacities of educational systems to change (Fullan, 
2001), or to the environments in which they operate (Ginsberg & Berry, 1997), their adherents 
overlook critical issues. 

Gormley and Weimer (1999) offered six criteria for judging report cards: validity, 
comprehensiveness, comprehensibility, relevance, reasonableness, and functionality. Two 
criteria of interest in our study, comprehensiveness and functionality, refer respectively to the 
facts that a report card should “be comprehensive in terms of important dimensions of 
organizational performance and . . . include a range of indicators” (p. 36) and that “a report card 
should be crafted in such a way that it convinces targeted organizations to engage in appropriate 
. . . behavior” (p. 37). Relevant to comprehensiveness, Ewell (2001) stated that subcategories that 
comprise a larger category should be neither too closely correlated nor too negatively correlated; 
a moderate correlation is best. High correlations would suggest that subcategories (e.g., report 
card grades) were assessing the same trait, and thus were redundant; negative correlations would 
suggest that high scores on one trait might actually work against achieving high scores on 
another trait. Kennedy (1999) categorized indicators of student outcomes and described the 
limitations of each category. 

The visibility of state K-12 report cards has been elevated over the last seven years by the 
annual publication entitled Quality Counts, produced by Education Week (2002). In an 
analogous effort, the National Center for Public Policy in Higher Education (NCPPHE) (2000) 
produced higher education report cards for each state. Quality Counts provokes the question of 
the report cards’ functionality, i.e., whether states’ grades in K-12 education can inform 
policymakers’ decisions. Are grades related to policymakers’ decisions, or to demographic 
conditions that are beyond their control? Cunningham and Wellman (2001) addressed the 
identical question with regard to NCPPHE’s higher education state report cards. Ackoff (1994) 
discussed the surroundings of an open system, and defined the contextual conditions of that 
system as those things that cannot be directly influenced or controlled. Richardson, 
Reeves-Bracco, Callan, and Finney (1999) described these factors as comprising the policy environment. 
Ackoff also defined variables that can be influenced or perhaps controlled as part of the 
transactional environment. 

Scholars debate the relative weight of demographics and policy in influencing the quality 
of education. Two demographic factors often thought to influence education are wealth and 
ethnicity. With regard to wealth, Wenglinsky (1997) argued that since the Coleman report, 
sociologists tend to emphasize the importance of the environment over anything that schools can 
determine. As for ethnicity, differences between minority and majority standardized test scores 
are well documented (e.g., Borman, Stringfield, and Rachuba, 2000; College Board, 1999; Lee, 
2002). Oakes and her associates (1990) revealed that the additive effects and interactions among 
income, minority status and school factors were complex and practically inseparable. In the 
policy arena, the research of Odden, Monk, Nakib, and Picus (1995) revealed little variation in 
how districts spend money. They claimed, therefore, that it was unsurprising that student 
achievement did not correlate with spending patterns. They also argued that hiring larger 
numbers of teachers either to reduce class size or to provide more out-of-classroom services has 
not boosted student achievement very much. Wenglinsky (1997), however, claimed from his 
LISREL analysis of NAEP data and Common Core of Data for U.S. schools that spending 
money to reduce class size will result in higher student achievement. Finn (2002) recently 
presented positive effects of class size reduction from a randomized study, but offered three 
cautions: (a) there is a right way and a wrong way to do it (the wrong way is without professional 
development and support); (b) reduced class size (which directly affects student/teacher 
interaction) is a far better statistic than pupil/teacher ratio (which addresses only the number of 
teachers in the building); and (c) more research is needed on the effect of reduced class size. 
Muller and Schiller (2000), investigating the effects of testing policies on states, ran hierarchical 
linear modeling on the National Education Longitudinal Study, 88-92 and the National 
Longitudinal Study of Schools and found inconsistent effects. They interpreted their findings as 
fully supporting neither side of the testing policy debate, “neither the broad claims of proponents 
of greater equity nor the grim forecasts of opponents” (p. 211). Even when effective policies are 
identified, they may be costly to implement. Green (1994) stated that inherent in policy is the 
difficulty of simultaneously satisfying all needs and maximizing all outcomes. He in fact 
emphasized his opinion that greater equity may mean a tradeoff in quality. Evan (1996) noted 
also that the effect of a change may take so long to appear that time is needed to properly 
evaluate it. So there appears to be no clear picture of the relationships between demographics or 
policy on the one hand, and quality of education on the other. 

Purpose 

Our basic question is to what extent the Quality Counts 2002 report cards provide valid 
information to clients and public managers. In this paper, we address two research questions: (a) 
How comprehensive are the report card grades, i.e., do they appear to cover a range of concerns, 
or are they highly intercorrelated? This question addresses comprehensiveness. (b) Are there 
environmental factors related to states’ grades? This second question implies its converse: is 
there room for policy to improve grades? This question addresses functionality. 

We were interested in looking at variables that define geographic, economic, or 
demographic climates in the states. Many of these variables cannot be directly controlled by the 
state or K-12 education. We also wanted to consider what we perceived as important variables 
that were not necessarily within the control of K-12 education but could be controlled by the 
state. Through our two research questions, we hope to initiate a discussion about the relationship 
among Quality Counts 2002 grades and between specific grades and a limited number of 
possible predictor variables. Another researcher may have chosen different variables for 
different reasons, but our purpose was to create a starting point. We intend to continue our 
analysis and hope that others find ways to refine, critique or expand what we did to stimulate 
scholarly inquiry and uncover relevant relationships that have implication for policy or at least 
for discussion of the usefulness of state report cards. 

We believe our analysis offers the potential to raise several policy-relevant questions. 
The impetus for first trying to understand the relationship between report card categories was to 
see if these relationships were intuitive. If a significant relationship exists, does it make sense? 
If not, is it possible that the construction of the category needs revision? These are important 
questions since state-level policymakers are the intended audience of such report cards. 

The second line of inquiry focuses on the relationship between select elements of a state's 
educational context and its report card performance. Strong relationships between a report card 
grade and demographic factors largely outside state policymakers’ control may signify 
policymakers’ inability to affect that grade. 

A Review of the Report Card Categories 

The brief review of the definition of graded categories contained in the Quality Counts 
2002 report card serves as a summary for the reader. It is not the intent of this article to fully 
define each subcategory and its mathematical contribution to the category grade. Nor is it our 
purpose to debate the definition and legitimacy of any category, subcategory or the placement 
and inclusion of any subcategory until after we completed our analysis. Complete and full 
descriptions of the K-12 report card categories and subcategories are available in the Quality 
Counts 2002 report card (Education Week, 2002). 

• Student achievement. The percentage of 4th and 8th graders who score above "proficient" on 
the year 2000 National Assessment of Educational Progress (NAEP) examinations in reading, 
writing, and math. Several states were missing data for Student Achievement, 
so we focused on 8th grade scores. We added the three scores of 8th grade performance on 
reading, writing, and mathematics for each state. Any state with a missing grade in any of 
these categories was excluded. Thirty states had complete information on 8th grade 
performance. 

• Standards and accountability. The extent to which a state 1) has developed educational 
standards, 2) uses various assessment techniques to measure those standards, and 3) holds 
schools and students accountable for performance. 

• Teacher Quality. How well states assess teachers and assure that those teaching are 
teaching in their field of expertise, and the level of professional support and training for 
teachers' experiences. Also included as a portion of this grade is the effectiveness of teacher 
education in the state. 

• Resources. This area is broken down into two subcategories: Adequacy and Equity. Each of 
the two areas had a complete and separate grade, so both were used as K-12 grades. 
Adequacy measures a state's spending effort on education, including spending per student 
and percentage of taxable dollars devoted to education. Equity measures a state's effort to 
equalize per-pupil funding levels across districts. 

Method 

Our first research question, testing for relationships between report card categories, was 
fairly straightforward. We conducted a two-tailed Pearson correlation analysis among five report 
card categories from Quality Counts 2002. We were primarily interested in looking at the 
relationships to see if they made intuitive sense. In addition, significant relationships might 
translate into important policy considerations. For example, a significant negative correlation 
between two categories might indicate a state’s difficulty in obtaining high grades in both of 
those categories, outlining inherent tradeoffs in state educational emphasis and investment. This 
would not, however, necessarily call into question the validity of either report card category; it 
might instead be a sign of the report card’s comprehensiveness. 
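
The pairwise analysis described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the grades below are hypothetical values for six states, not the Quality Counts 2002 data; the function name pearson_r is ours; and missing grades are handled by pairwise deletion, which is our reading of how the Student Achievement gaps were treated. The two-tailed significance test is omitted.

```python
from itertools import combinations
from math import sqrt


def pearson_r(xs, ys):
    """Return (r, n) using only the pairs where both values are present."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    return sxy / sqrt(sxx * syy), n


# Hypothetical numerical grades for six states (NOT the Quality Counts data).
# None marks a state with no achievement grade.
grades = {
    "achievement": [70, None, 82, 65, None, 90],
    "standards": [85, 78, 80, 92, 70, 88],
    "teacher_quality": [80, 75, 79, 90, 68, 85],
    "adequacy": [72, 80, 85, 60, 77, 95],
    "equity": [88, 70, 65, 90, 75, 60],
}

# Five categories give ten distinct pairs.
pair_results = {
    (a, b): pearson_r(grades[a], grades[b])
    for a, b in combinations(grades, 2)
}

for (a, b), (r, n) in pair_results.items():
    print(f"{a} x {b}: r = {r:+.3f} (n = {n})")
```

With five categories the loop produces ten coefficients, and any pair involving the achievement grade is computed on fewer states than the others.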

The second research question, which entails using state contextual variables as predictors 
for the report card categories, was more complex. We were not prepared to state hypotheses. 
Rather, our intent was to take a step toward developing hypotheses. This meant that our 
investigation was exploratory and that a regular multiple regression was not appropriate 
(McNeil, Newman, and Kelly, 1996). A stepwise regression was the prime methodological 
candidate for the analysis, and we chose to use backward stepwise regression. Backward 
stepwise regression starts with all independent variables and deletes those that are not contributing to 
the goodness of fit. We did not use forward stepwise regression because we did not postulate 
which variables were most salient (McNeil, Newman, and Kelly, 1996). 

Since our premise was to test report card categories for the 50 states, the maximum 
number of observations was fifty. We did not include a regression equation for Standards and 
Accountability, since we assumed that this is almost entirely a function of state policy making, 
and unrelated to any contextual variables. Of the four K-12 report card categories we tested, only 
achievement did not contain grades for all 50 states, for reasons stated in the previous section. 
We wanted to outline possible issues that might arise because of our limited observations for 
Student Achievement. According to the University of Texas Statistics Department website, 
consensus is lacking regarding the number of observations required per independent variable for 
multiple regression analysis. The numbers in the literature range from 5 to 50. 

Given these considerations, we wanted to at least meet the suggested five-observation 
minimum, but exceed it where possible. The K-12 report card categories served as the 
dependent variables, and the various contextual variables that served as our independent 
predictors all contained 50 observations. For the achievement equation, we did choose 5 
predictor variables, placing us at 6 observations per independent variable. The other equations 
contained fewer predictor variables and contained all 50 observations. Since our intent was 
purely exploratory, we decided to pursue our analysis even if we were very close to our self- 
imposed minimum number of observations per independent variable for the achievement 
equation. As an additional measure, we tracked the statistical significance of the total final 
models and each predictive variable contributing to each model. This allowed us to offer 
cautionary and conservative remarks regarding each model. 
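
The observations-per-predictor arithmetic above is simple enough to check directly. The short sketch below uses only counts stated in the text (30 states and five predictors for the achievement equation); the function name and the constant are ours.

```python
# Lower end of the 5-to-50 range reported in the literature cited above.
MIN_OBS_PER_PREDICTOR = 5


def obs_per_predictor(n_obs, n_predictors):
    """Observations available per independent variable."""
    return n_obs / n_predictors


# Student Achievement equation: 30 states, 5 predictor variables.
ratio = obs_per_predictor(30, 5)
print(ratio, ratio >= MIN_OBS_PER_PREDICTOR)
```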

Choosing Predictor Variables 

In this section, we first offer a general description outlining our selection criteria and 
rationale for choosing predictor variables. We then list each report card category and outline the 
predictor variables associated with each category. Our starting point was to review each report card 
category and discuss whether a measure of income, ethnicity, or both should be included as 
predictor variables for the regression analysis. It is conceivable that the reader could fashion a 
counter argument for excluding a particular variable and including another. We encountered 
some disagreement between ourselves and debated the merits of a list of variables much larger 
than the one we eventually decided upon. The construction of the four equations is given below, 
and the data sources for each predictor are shown in Appendix B. 

Report Card Categories and Predictor Variables 

Predictor Variables for Student Achievement: We were interested in whether the 
often-repeated belief that achievement is just a function of income held among states for the 
Quality Counts 2002 Student Achievement category. We also wanted to test some measure of 
the financial priority states give to K-12 education (K-12 spending per $1000 of state wealth). 
We were also interested in some general educational conditions of the state, such as the number 
of students per teacher, teacher pay, and the percentage of minority students in K-12. 

Teacher Quality Predictor Variables: A host of variables could arguably be deemed as 
influential in teacher quality. Conversations in states around the nation invariably do link 
teacher quality with some notion of pay (e.g., New Mexico Commission on Higher Education 
and the New Mexico State Department of Education, 2000), especially in states where teacher 
pay is far below the national average; thus teacher pay was included as a predictor variable. One 
might argue that large class sizes correlate with teacher quality as well, so the number of students 
per teacher was the second variable we included in the analysis. Finally, a state's emphasis on 
K-12 as a priority also seemed like a reasonable variable for inclusion. K-12 spending per $1000 
of state wealth was our final predictor variable for this equation. 

Resources Adequacy Predictor Variables: The first issue Resources Adequacy addresses 
is the level at which each student is funded, and the second is how much of a financial priority K-12 is 
in the state. We thought that income per capita might be a good predictor of how well students 
and K-12 education are funded. Another initial thought was that states with larger minority 
populations in their K-12 system and larger class sizes might, for some reason, find it more 
challenging to adequately fund K-12 education. A large K-12 minority population would 
indicate a large state minority population. Disparities between minority and majority incomes are 
well established in the literature (e.g., Phillippe, 2000, p. 59), so a state with a large minority 
population might find it more challenging to adequately fund K-12 education. 

Resources Equity Predictor Variables: The predictor variables chosen for this equation 
were the same as for the Adequacy equation, but for different reasons. Attention to K-12 
financial equity among districts is largely a function of state philosophy, but we were curious as 
to whether an indicator of state wealth (income per capita) had an influence on whether equity 
was actually achieved. In addition, in states like New Mexico, where minority populations are 
significant, our sense was that there is a stronger inclination toward equitable funding. Thus, we 
included the measure of the percentage of minority students in the K-12 system. Without any 
speculations, we also included teacher pay in the analysis, if for no other reason than that we had 
included it in the Adequacy equation. 

Results 

The Pearson correlations in Table 1 were obtained from the numerical grades for each 
report card category for all 50 states. We instructed SPSS to compute the pairwise Pearson 
correlations. Student achievement was the only category that did not contain 50 observations, as 
data for 20 states were not available. 




Put Table 1 about here. 



Grade 8 NAEP 2000 

Student achievement as measured by Grade 8 NAEP 2000 is the first row in Table 1. 
One might initially speculate that Teacher Quality should enhance student achievement. Teacher 
Quality depends on the proportion of those teachers who are certified to teach in their field of 
expertise and on professional development and the strength of teacher education in the state. If 
all of these components were strong, it would seem that student achievement would also be 
strong. We found no significant correlation between Grade 8 NAEP 2000 and Teacher Quality, however. 

We also found no significant correlation between Grade 8 NAEP 2000 and Standards and 
Accountability. One interpretation of this is that states may have high Standards and 
Accountability, but it is possible that student achievement is unaffected because some schools 
meet the Standards and Accountability while others simply do not. Without further information, 
we are left to believe that Standards and Accountability may be in effect in a given state, but the 
ability of those Standards and Accountability to leverage improvement or behavioral change at 
an institutional level is in question. 

We did find a significant positive relationship between Grade 8 NAEP 2000 and the 
Adequacy with which states fund their K-12 systems. Those states that devote greater resources 
to K-12, spend more per student, and channel a large percentage of tax dollars to K-12 education, 
tended to do well on Grade 8 NAEP 2000 as well. One indication from this is that money does 
matter. Perhaps money is not the only answer to improve student achievement, but our analysis 
suggests that it is part of the answer. 

Conversely, Grade 8 NAEP 2000 had a significant negative correlation with Equity. That 
is, states that do a better job of funding students equally across districts tended to do poorly on 
Grade 8 NAEP 2000. The relationship between equity and achievement may represent the 
multiple tradeoffs and policy dilemmas that states will face as they try to find effective K-12 
funding solutions. Equal funding to students, regardless of district, is a laudable goal indeed. 
But are resources spread too thinly when this is done? Going further, perhaps equity and student 
achievement are coincidentally correlated, and any constructed explanation equates to mere 
speculation at best or misguided assumptions at worst. It appears there are no easy answers. 

Standards and Accountability 

States with high Standards and Accountability scores tended to do well on Teacher 
Quality. Neither Standards and Accountability nor Teacher Quality was significantly correlated with Student 
Achievement, however. Our supposition for this set of facts is twofold. First, states with high 
Standards and Accountability have likely set a strong infrastructure in place regarding content 
specific measurements for student achievement. Teacher Quality, which largely measures 
subject specific training and certification of teachers, appears to support this infrastructure. 
Second, and more importantly, the effects of this infrastructure and strong teacher quality may 
take a few years to manifest in terms of student achievement. This manifestation is not captured 
in the Student Achievement data reported in the Quality Counts 2002 report since the student 
achievement subcategories use 2000 NAEP scores. Standards and Accountability and Teacher 
Quality capture 2001 activity. In sum, it is probably not fair to conclude that strong Teacher 
Quality and strong Standards and Accountability do not positively influence performance and 
student achievement. In line with Evan’s (1996) caution, we note that time may be needed to 
properly evaluate the effect of these factors. 

Backward Stepwise Regression Results 

Backward stepwise regression considers all predictor variables, for the given dependent 
variable, in its initial analysis. Each predictor variable is analyzed and an analysis of variance is 
performed to determine if the complete model (for the given iteration) is significant. The 
predictor variable with the least significance is removed after each step, and then the analysis is 
iterated until all variables are stepped out or a significant model is obtained. Any variable whose 
standardized coefficient had a significance level (p value) over .15 was stepped out of our model. This relatively 
high tolerance was set for analysis purposes, but variables approaching such a limit should be 
interpreted with caution. 
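
To make the stepping-out procedure concrete, the following is a minimal sketch of backward elimination written from scratch in Python. It is not the SPSS routine used for the analysis. It assumes that the .15 criterion refers to the two-tailed p value of each coefficient's t test, and it stops as soon as every remaining predictor meets that criterion. The data are synthetic, and all function and variable names are hypothetical.

```python
from math import exp, lgamma, log, pi, sqrt


def invert(a):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(a)
    m = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [v - f * w for v, w in zip(m[r], m[col])]
    return [row[n:] for row in m]


def t_two_tailed_p(t, df, steps=2000):
    """Two-tailed p value via Simpson's rule on the Student t density over [0, |t|]."""
    log_c = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)

    def pdf(x):
        return exp(log_c - (df + 1) / 2 * log(1 + x * x / df))

    h = abs(t) / steps
    total = pdf(0.0) + pdf(abs(t))
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return max(0.0, 1.0 - 2.0 * total * h / 3.0)


def ols_p_values(columns, y):
    """Regress y on an intercept plus the columns; return one p value per column."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    xtx = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)] for i in range(k)]
    xty = [sum(X[r][i] * y[r] for r in range(n)) for i in range(k)]
    inv = invert(xtx)
    beta = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    resid = [y[r] - sum(X[r][j] * beta[j] for j in range(k)) for r in range(n)]
    df = n - k
    s2 = sum(e * e for e in resid) / df
    return [t_two_tailed_p(beta[i] / sqrt(s2 * inv[i][i]), df) for i in range(1, k)]


def backward_eliminate(predictors, y, p_remove=0.15):
    """Drop the least significant predictor until all remaining have p <= p_remove."""
    kept = dict(predictors)
    while kept:
        names = list(kept)
        p_values = ols_p_values([kept[name] for name in names], y)
        worst = max(range(len(names)), key=lambda i: p_values[i])
        if p_values[worst] <= p_remove:
            break
        del kept[names[worst]]
    return list(kept)


# Synthetic example: y depends on x1 only; x2 is an unrelated alternating series.
x1 = [float(i) for i in range(12)]
x2 = [1.0, -1.0] * 6
noise = [0.3, -0.2, 0.1, -0.4, 0.2, 0.0, -0.1, 0.4, -0.3, 0.1, -0.2, 0.1]
y = [1.0 + 2.0 * a + e for a, e in zip(x1, noise)]

kept = backward_eliminate({"x1": x1, "x2": x2}, y)
print(kept)
```

In this synthetic case x2 is stepped out on the first pass and x1 is retained. The intercept is always kept, and the numerical integration is adequate for a sketch but is not a substitute for a statistical library.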

Table 2 summarizes the results for the regression equations tested. Four models were 
tested, with each K-12 report card category serving as the dependent variable for each model. 
Included in the table are the multiple correlation coefficients (R) and adjusted coefficients of 
multiple determination (R squared). We looked at both the correlation between the dependent 
variable and the entire model (R) and the amount of variance in each report card category 
explained by the combination of predictor variables in the final model (R squared). The adjusted 
coefficients of multiple determination are reported in Table 2 since we reported standardized 
coefficients for the predictor variables. 



Put Table 2 about here. 



Student Achievement. The Student Achievement model was significant at the .01 level, 
despite only having 30 observations. Only 30 states had complete achievement data, by our 
definition, so all other corresponding cases for predictor variables were excluded. Still, the 
combination of variables retained in our final model explained 65.7% of the variance. The 
model, in total, produced a .832 correlation with Student Achievement. 
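
The two figures just reported can be checked against each other with the usual adjustment formula, adjusted R squared = 1 - (1 - R squared)(n - 1)/(n - k - 1). The sketch below assumes n = 30 states and k = 3 retained predictors (income per capita, K-12 spending per $1,000 of state wealth, and percentage of minority students, the three discussed in the text). The value of k is our assumption, not something this section states outright, and the function name is ours.

```python
def adjusted_r_squared(r, n_obs, n_predictors):
    """Adjusted coefficient of multiple determination from the multiple correlation R."""
    r2 = r ** 2
    return 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Student Achievement model: R = .832, n = 30, k = 3 (k is an assumption).
achievement = adjusted_r_squared(0.832, 30, 3)
print(f"{achievement:.3f}")  # prints 0.657
```

Under that assumption the multiple correlation of .832 and the 65.7% of variance explained are consistent with one another.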

As we suspected, income per capita was retained in the model, and according to the test 
statistics for the standard coefficient for income per capita, this variable had the strongest effect 
of all retained variables. K-12 spending per $1,000 of state wealth was also significant. While 
this measure could be a function of state wealth in general, we interpret K-12 spending per 
$1,000 of state wealth more as a measure of a state's priority to K-12 education. It is possible for 
a wealthy state to devote few resources to K-12, or any other state function for that matter. Our 
interpretation, therefore, is that those states that make K-12 a priority by devoting resources to it, 
do enhance achievement to some degree. 

Interestingly, the percentage of minority students in K-12 negatively contributed to 
Student Achievement. This is not surprising given the history of correlation between ethnicity 
and test performance. The implication, therefore, is that states with large minority populations 
must continue to search for strategies to help such populations achieve, assuming that 
standardized tests will continue to be the proxy for success. 

Teacher Quality. The average number of students per teacher and the priority states give 
to K-12 were both stepped out of the regression equation for teacher quality. The only variable 
retained was average pay for teachers, though the percentage of the variance explained was fairly 
low at 4.7%. Since teacher pay was the only variable retained, the standard coefficient is simply 
the correlation between teacher pay and Teacher Quality. The model, as a whole, was the 
weakest of the four that we ran, and the test statistics for average pay indicate that any 
relationship between teacher pay and Teacher Quality should be interpreted with caution. In one 
respect, since teacher pay was retained in the analysis, we would cautiously offer that there are 
many things not included in our model that affect Teacher Quality, but pay, at least to a certain 
extent, may also bear on that quality. 

Resources Adequacy. The measure of priority states give to K-12 was not included as a 
predictor variable since several of the subcategories that comprise Resources Adequacy directly 
speak to this issue (e.g. percent of total taxable resources spent on education). For every 
equation, our intent was to not include any predictor variables that comprised the report card 
category itself, for it would be trivial to find a significant relationship between the overall 
grade and the subcategory that is part of that overall grade. 

The model for Resources Adequacy shown in Table 2 explained 22.3% of the variance. 
The predictor variables that comprised the model yielded a moderate .504 correlation with 
Resources Adequacy. Only income per capita was stepped out of our model. This was 
somewhat surprising in that one might guess that wealthy states might more generously fund 
K-12 education. Perhaps the continued climate of tax reductions and reluctance on the part of 
policy makers to even speak of raising taxes makes this a somewhat outmoded supposition. 

We did find that states with higher teacher pay scored well on Resources Adequacy. It 
seems logical that states that pay teachers well also fund K-12 education well. Minority 
enrollment contributed negatively to the model, however. States that have a large percentage of 
minority K-12 students do not appear to fund their educational systems as well as those states with 
lower minority student populations. This raises several questions: Is this because states with 
higher minority enrollments have fewer resources to fund education? Is this because constituents 
in such states are unable to communicate their needs? Or is this because states simply are not 
funding school systems that have large numbers of minority students? Obviously, such a finding 
needs additional investigation to begin uncovering the negative relationship represented by the 
equation. 

Resources Equity. Our Equity model explained 23.3% of the variance and was 
significant at the .01 level. First, and quite importantly, income per capita produced a negative 
contribution to the overall model. States with high income per capita tend to be less equitable. 
That is, there are wider disparities between districts in terms of funding, even though the state 
tends to be wealthy, relative to other states. This may have to do with anything from state 
philosophy to how local money is retained and distributed. Even if a state does not allocate local 
resources, the Equity grade measures how much of an effort that state makes to focus resources 
on the poorer district. Given this, a conceivable interpretation is that wealthier states do not 
make as much of an effort to equalize funding across districts. 

States with a high percentage of minority students in K-12 and high student to teacher 
ratios appear to be more successful at equalizing funding across districts. The measure of 
minority student percentage was retained in the final model. The indication seems to be that 
states with large K-12 minority populations may work harder at producing equity. 
When one compares the effect of the percentage of minority K-12 enrollment on Equity 
with its effect on Adequacy, the two are exact opposites. Perhaps this is because working toward equity, or 
even achieving it, does not imply adequacy. The challenge of adequate funding may well remain 
even if equity exists. 

Discussion 

Research question 1 (on comprehensiveness) 

The answer to our first question is yes, the Quality Counts 2002 report card grades have 
comprehensiveness; they are not so highly intercorrelated as to appear all to be measuring 
essentially the same trait. Of the ten possible correlations, only three were statistically 
significant, and the highest correlation was a moderate 0.447 between Teacher Quality on the 
one hand and Standards and Accountability on the other. In fact, as we said above, some of the 
lack of correlation raises disturbing questions. The near-zero correlation between Standards and 
Accountability and Teacher Quality on the one hand, and Grade 8 NAEP scores on the other 
should be monitored to see if, over time, a correlation appears. Many clients and public 
managers see student achievement data as a kind of pedagogical bottom line, and their interest in 
Standards and Accountability and Teacher Quality stems purely from their presumed eventual 
effect on those achievement data. 
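
The count of ten possible correlations follows from pairing five grades, and the count of three significant ones can be checked approximately from the coefficients in Table 1. The sketch below uses the Fisher z approximation rather than whatever exact test the original analysis used, and it assumes that all 50 states contribute to every pair. Both are assumptions, so the p-values it produces are rough guides, not a reproduction of the reported significance levels.

```python
# Rough check of the Table 1 correlations using the Fisher z approximation.
# N_STATES = 50 is an assumption; some grades may be missing for some states.
from itertools import combinations
from math import atanh, sqrt
from statistics import NormalDist

GRADES = [
    "Grade 8 NAEP",
    "Standards and Accountability",
    "Teacher Quality",
    "Resources: Adequacy",
    "Resources: Equity",
]

# Coefficients as reported in Table 1 (upper triangle only).
TABLE_1 = {
    ("Grade 8 NAEP", "Standards and Accountability"): -0.056,
    ("Grade 8 NAEP", "Teacher Quality"): 0.056,
    ("Grade 8 NAEP", "Resources: Adequacy"): 0.407,
    ("Grade 8 NAEP", "Resources: Equity"): -0.446,
    ("Standards and Accountability", "Teacher Quality"): 0.447,
    ("Standards and Accountability", "Resources: Adequacy"): -0.080,
    ("Standards and Accountability", "Resources: Equity"): -0.052,
    ("Teacher Quality", "Resources: Adequacy"): -0.049,
    ("Teacher Quality", "Resources: Equity"): -0.04,
    ("Resources: Adequacy", "Resources: Equity"): -0.219,
}

N_STATES = 50  # assumption, see lead-in


def two_tailed_p(r, n):
    """Approximate two-tailed p-value for a Pearson r via Fisher's z."""
    z = atanh(r) * sqrt(n - 3)
    return 2 * (1 - NormalDist().cdf(abs(z)))


pairs = list(combinations(GRADES, 2))
significant = [
    pair for pair in pairs if two_tailed_p(TABLE_1[pair], N_STATES) < 0.05
]

print(len(pairs), "pairs;", len(significant), "significant at .05")
for pair in significant:
    print(pair, TABLE_1[pair])
```

Because the largest non-significant coefficient (-0.219) is well short of the .05 threshold under this approximation, the tally of significant pairs is not very sensitive to modest changes in the assumed number of states.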

Furthermore, the possibility that Resources Equity is negatively correlated with other 
grades raises serious questions. Green’s (1994) warnings about policy tradeoffs, cited above, 
seem relevant here, as our findings suggest that there may be a tradeoff between equity and student 
achievement. It is information like this that must come to the forefront of policy discussions. 

The issue of equity only initiates the conversation, for the larger solution may encompass 
funding amounts or where the funding is focused (e.g., salaries, class size reductions, or teacher 
preparation and professional development). 

Research question 2 (on functionality) 

The answer to our second research question is also yes, there are demographic variables 
associated with the Quality Counts 2002 grades, but this answer comes with good news and bad 
news. The good news is that with regard to Teacher Quality, Resources Adequacy, and 
Resources Equity, demographic factors accounted for only from four to 23 percent of the 
variance among states. This implies that there is plenty of room for policy to affect the grades. 
Therefore the Quality Counts 2002 report cards have functionality. The bad news is with our 
pedagogical bottom line. Nearly 66% of the between-state variance in Grade 8 NAEP was 
accounted for by our demographic factors. This leaves comparatively little room for policy 
effects. 

But even here there are opportunities for responses other than fatalism. Along with the 
claim of Odden, et al. (1995), cited above, that there is little variation in how districts spend 
money, they recommend that schools begin to spend resources specifically in ways that will 
produce high achievement. But we also cited sources above that seem to counter the claims of 
Odden et al. about what expenditures will and will not lead to high achievement. So it is clear 
that more research and deliberation need to occur in this area. And this is where the Quality 
Counts report cards may not serve us well. Such large data collection projects as Quality Counts 
are seriously constrained by the resources they have to collect data and the availability of those 
data. The National Center for Public Policy in Higher Education (2000, p. 182) said of its report 
cards that it restricted itself to “publicly available information that has been collected by 
government agencies and by nationally recognized private organizations,” and it described 15 
categories of information that it considered important, but that were unavailable. Quality Counts 
must have faced similar frustrations. It has relied on available data, and sometimes those data 
are poor proxies for the phenomena of interest. For example, we cited above both Finn’s (2002) 
explanation of why pupil/teacher ratio is a less valuable statistic than class size, and Kennedy’s 
(1999) catalog of the weaknesses of student achievement indicators. Studies aiming to assess 
validly the effects of policy on education outcomes will need to look for better indicators than 
the Quality Counts grades. 

In sum, we see Quality Counts 2002 as a very useful starter of very important 
conversations, but we do not believe that it resolves many important questions. 
McDonnell (1997) called for “rational deliberation” about the meaning of such assessment 
systems. Clearly this is not a quick fix. She admits that even proposing it leaves her open to the 
charge of political naivete. But it is precisely because of the indefiniteness and the lack of 
guarantees in her prescription that she is in very good company. John Dewey saw the necessary 
and sufficient criterion for Democracy to be that there are “many interests consciously 
communicated and shared; and there are varied and free points of contact with other modes of 
association” (cited in Noddings, 1995, p. 35). We see no solution other than McDonnell’s to 
such ambiguous implications as we have revealed here. 

But parties contending over the implications of Quality Counts 2002 must accept the 
challenge to do this well. Practitioners must work to overcome their cynicism about 
policymakers’ motives and distance from the problems. Policymakers must eschew the “ugly 
face” of persuasion, when it becomes “intentionally manipulative, robbing people of their 
capacity to think independently” (McDonnell, 1997, p. 400). Practitioners must work to 
understand policymakers’ needs and responsibilities to have accountability data, and 
policymakers must give up their illusion that accountability data alone will bring about a cheap 
fix for education so that they can ignore the most costly “capacity building.” 

Therefore, the development of a state report card is a worthwhile endeavor. And analyses 
such as ours are necessary for casting a critical eye on report cards and their implications. We see 
scholars working to refine report cards, and we hope that more states will elect to participate in 
the data collection. Report cards can promote enlightened discussion. But the creation of report 
cards is only the beginning of a process. The process must continue with analyses of the report 
card variables and other associated variables, and with rational open-minded deliberation about 
the findings of such analyses. The research reported here is one stage in that process. 
Table 1 

Pearson correlations (two-tailed tests) among Quality Counts 2002 report card grades 

                              Standards and   Teacher         Resources:      Resources:
                              Accountability  Quality         Adequacy        Equity
Grade 8 NAEP, 2000            -0.056          0.056           0.407*          -0.446*
Standards and Accountability                  0.447**         -0.080          -0.052
Teacher Quality                                               -0.049          -0.04
Resources: Adequacy                                                           -0.219

** Significant at .001 level 
* Significant at .05 level 